Column data type handling #5

bpintea · 2018-04-30T19:11:37Z

Once a query is executed, the client app queries the driver for details about the results it received: the number of columns, the number of rows and the types of the columns.
Each column's type is mapped onto a descriptor, the IRD (Implementation Row Descriptor). The details available there are numerous: the SQL type of the data contained in the column, size ("precision"), decimal digits ("scale"), nullability, type name etc., as well as the [base] table name, column alias and so on.

Generally with DBMSes, the data types can be either fixed SQL data types (e.g. INTEGER) or derived, column-specific ones (like VARCHAR(10) or FLOAT(precision, scale)). So, typically, a driver needs to receive from the data source a verbose description of the result set for every query.

However, ES/SQL currently only supports "fixed" data types (i.e. one cannot "create" a table/index with specific, non-general types). The driver takes advantage of this by querying Elasticsearch for the data types when it first connects and then caching them (per connection). Then, for each application-specific query result set, the driver simply refers to the cached types when the app requests details about the columns' data types. This is what this PR is mostly about.
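To make the caching idea concrete, here is a minimal sketch, not the driver's actual code. All `ex_` names, the struct layouts and the sample rows are hypothetical; the static table merely stands in for whatever the one-off type query returns at connection time.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down description of one data type, as a driver
 * might cache it after asking the server once for its type catalogue. */
typedef struct {
	const char *name;     /* type name as reported by the server */
	short sql_type;       /* ODBC SQL type code */
	int column_size;      /* "precision" */
	short decimal_digits; /* "scale" */
} ex_type_st;

/* Hypothetical per-connection state: the cache lives as long as the
 * connection does. */
typedef struct {
	const ex_type_st *types;
	size_t types_cnt;
} ex_conn_st;

/* Stand-in for the rows a type query would return (illustrative values). */
static const ex_type_st ex_server_types[] = {
	{"INTEGER", 4, 10, 0},
	{"DOUBLE", 8, 15, 0},
	{"KEYWORD", 12, 256, 0},
};

/* Populate the cache once, at connection time. */
static void ex_conn_load_types(ex_conn_st *conn)
{
	assert(conn != NULL);
	conn->types = ex_server_types;
	conn->types_cnt = sizeof(ex_server_types) / sizeof(ex_server_types[0]);
}

/* Resolve a column's type by name against the cache; NULL if unknown. */
static const ex_type_st *ex_conn_lookup_type(const ex_conn_st *conn,
	const char *name)
{
	size_t i;

	assert(conn != NULL && name != NULL);
	for (i = 0; i < conn->types_cnt; i++) {
		if (strcmp(conn->types[i].name, name) == 0) {
			return &conn->types[i];
		}
	}
	return NULL;
}
```

With such a cache in place, answering an application's question about a column only needs the type name that came back with the result set, with no extra round trip to the server.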

As a side effect, the four column data-type measures (column size, decimal digits, octet length and display size) are now finished/fixed.
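For readers less familiar with these four measures, the sketch below tabulates them for a few signed numeric types. The figures are my reading of the ODBC reference (Appendix D) and should be checked against it; the `EX_` constants are local copies of the ODBC type codes so that the snippet compiles without the ODBC headers, and none of the names come from the driver.

```c
#include <assert.h>
#include <stddef.h>

/* Local copies of ODBC type codes (same values as SQL_INTEGER etc.). */
enum {
	EX_SQL_INTEGER = 4,
	EX_SQL_SMALLINT = 5,
	EX_SQL_DOUBLE = 8,
	EX_SQL_BIGINT = -5
};

typedef struct {
	int sql_type;
	int column_size;    /* max number of digits */
	int decimal_digits; /* digits right of the decimal point */
	int octet_length;   /* bytes when transferred in the default C type */
	int display_size;   /* characters needed to print any value */
} ex_measures_st;

/* Signed types assumed; decimal digits do not really apply to DOUBLE,
 * 0 is used as a placeholder. */
static const ex_measures_st ex_measures[] = {
	{EX_SQL_SMALLINT, 5, 0, 2, 6},
	{EX_SQL_INTEGER, 10, 0, 4, 11},
	{EX_SQL_BIGINT, 19, 0, 8, 20},
	{EX_SQL_DOUBLE, 15, 0, 8, 24},
};

/* Copy the measures of a type into *out; returns 1 if found, 0 if not. */
static int ex_measures_for(int sql_type, ex_measures_st *out)
{
	size_t i;

	assert(out != NULL);
	for (i = 0; i < sizeof(ex_measures) / sizeof(ex_measures[0]); i++) {
		if (ex_measures[i].sql_type == sql_type) {
			*out = ex_measures[i];
			return 1;
		}
	}
	return 0;
}
```

Note how the display size of the signed integer types is the column size plus one character for the sign, which is why the measures cannot simply be copied from one another.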

The 'SYS TYPES' query is now executed right after the connectivity check part of SQLDriverConnect(). The result is cached on the connection (es_type array member, one element for each type). When a query is executed and the results are processed, the column data type sent by ES is used to associate the right es_type element with the record that describes the column. This allows stripping the record structure of members that are only used for data type characterization, which saves memory and also gives some "transparency" in case new types are added or existing ones are changed. (Some change would still be needed in the driver, due to the way ODBC specifies the "size" and "precision" of a column.)
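The association between a record and its cached type can be pictured roughly as follows. This is a sketch under assumptions: the struct members and the `ex_` function names are invented for illustration and are not the driver's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cached type element, shared by all columns of that type. */
typedef struct {
	const char *type_name;
	short data_type;
	short nullable;
} ex_es_type_st;

/* Hypothetical descriptor record: it keeps only column-specific data
 * plus a pointer to the shared, read-only type element, instead of
 * carrying its own copies of the type attributes. */
typedef struct {
	const char *col_name;
	const ex_es_type_st *es_type; /* NULL until results are processed */
} ex_rec_st;

/* Called while processing a result set, once the column's type is known. */
static void ex_rec_bind_type(ex_rec_st *rec, const ex_es_type_st *type)
{
	assert(rec != NULL && type != NULL);
	rec->es_type = type;
}

/* Type attributes are read through the pointer. */
static short ex_rec_data_type(const ex_rec_st *rec)
{
	assert(rec != NULL && rec->es_type != NULL);
	return rec->es_type->data_type;
}

static const char *ex_rec_type_name(const ex_rec_st *rec)
{
	assert(rec != NULL && rec->es_type != NULL);
	return rec->es_type->type_name;
}
```

Since every record of a given type points at the same element, a change to what the server reports for that type shows up in all records without touching the record structure.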

edsavage

Looks pretty good to me Bogdan, although I would prefer that load_es_types was refactored to aid readability.

edsavage · 2018-05-01T08:40:32Z

driver/connect.c

+		goto end;
+	}
+
+	/* check row statues;


typo: statuses?

thanks. (these rows are indeed not that monumental.)

edsavage · 2018-05-01T08:41:27Z

driver/connect.c

+	}
+
+	/* check row statues;
+	 * calculate the lenght of all strings (SQLWCHAR members) returned


length.. (your favourite typo!)

thanks. (just ran a grep, this typo is occurring 40+ times..)

edsavage · 2018-05-01T08:55:02Z

driver/connect.c

@@ -1499,7 +2105,7 @@ SQLRETURN EsSQLGetConnectAttrW(
 		case SQL_ATTR_CURRENT_CATALOG:
 			DBGH(dbc, "requested: catalog name (@0x%p).", dbc->catalog);
 #if 0
-			if (! dbc->conn) {
+			if (! dbc->es_types) {


I'd prefer that we didn't have unreachable code blocks such as this one. Could it safely be removed?

I've changed this block to remove/update the dead code.

edsavage · 2018-05-01T09:23:40Z

driver/connect.c

+ * "manually" copy from a row structure (defined into the function) into the
+ * estype structure. The array of estype structs is returned.
+ */
+static BOOL load_es_types(esodbc_dbc_st *dbc)


This function is very long! Is there any way it could be broken into smaller, more easily digestible chunks?

An initial suggestion would be to move the macro and structure definitions out - although I do realise that this would break the scoping model you've used here.

I've broken the function down.
Initially I didn't want to have another structure/type declared, even though I didn't like the length of the resulting function either. But I think you're ultimately right, having it smaller/readable is a better solution.

droberts195

My main comment is just a question. As long as it's been thought through that's fine.

droberts195 · 2018-05-01T15:22:52Z

driver/handles.c

+ * comment at function end), so these values must stay in sync, since there
+ * are no C corresponding defines for verbose and sub-code (i.e. nothing like
+ * "SQL_C_DATETIME" or "SQL_C_CODE_DATE").
+ * The identity does not hold across the bord, though (extended values, like


typo: bord -> board

droberts195 · 2018-05-01T15:29:19Z

driver/defs.h

+#define ESODBC_SQL_NULL				0
+#define ESODBC_SQL_UNSUPPORTED		1111
+#define ESODBC_SQL_OBJECT			2002
+#define ESODBC_SQL_NESTED			2002


There are also geo and specialised types.

Do they get converted to "unsupported" on the Java side?

I believe the specialised types are sent as UNSUPPORTED currently, yes (at least the IP type is). Not sure where geo would fit, but I guess it's treated similarly.

bpintea added 12 commits April 26, 2018 13:58

small build script improvements

769f9b9

- make the warnings visible - warn if no MSVC edition is found

add a BUGH handle log macro

dd08d18

+ BUGH()

b/f: don't count 0-term for strings

f2bba4f

When returning strings to the application, ODBC requires the 0-terminator to be included in the buffer, but not counted in the returned length. This fixes the counting part (it was counted before).
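A minimal sketch of that convention, assuming narrow `char` strings for brevity (the driver deals in SQLWCHAR) and a hypothetical helper name, `ex_write_str`:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy `src` into `dest`, whose capacity `dest_size` includes room for
 * the terminator, truncating if needed. The terminator is always written
 * when dest_size > 0, but no reported length ever counts it:
 *  - the return value is the number of characters copied;
 *  - *avail is the full length of `src`, so the caller can detect
 *    truncation by comparing it against the buffer size. */
static size_t ex_write_str(char *dest, size_t dest_size, const char *src,
	size_t *avail)
{
	size_t len;
	size_t copied = 0;

	assert(dest != NULL && src != NULL && avail != NULL);
	len = strlen(src);
	if (dest_size > 0) {
		copied = len < dest_size ? len : dest_size - 1;
		memcpy(dest, src, copied);
		dest[copied] = '\0';
	}
	*avail = len;
	return copied;
}
```

For example, writing "INTEGER" into an 8-byte buffer fills all 8 bytes (7 characters plus the terminator) while reporting a length of 7.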

use the cached data types, remove them from recs

c6ac03c

Make use of the cached data types when returning info to the application; strip them from the record structure.

reformatting: remove trailing white spaces in code

5f40aa1

reflect repo name change in README

122f758

s/x-pack-odbc/elasticsearch-sql-odbc/

.scale and .precision also read from cached type

8835ddd

Use cached ES return type for these record members, in case they belong to an IRD descriptor.

switch IRD to using pure SQL types

9e15f03

Use the values provided by the server in the cached ES/SQL types to populate the IRD type members (moving away from SQL C types). This cleans up the implementation a bit and will allow copying of the descriptors later, if needed.

make ES/SQL to SQL C data mapping clear

0f54a26

- placed in a header, so it is easy to find and change if needed

add the 'display size' handling

3b5ef29

- this is a r/o IRD specific field => moved it to es_type;
- implement the Display Size calculation.

'column size' and 'decimal digits' now correct

9d40dc1

- fixed/finished the logic for these two measures

bpintea requested review from droberts195 and edsavage April 30, 2018 19:11

bpintea mentioned this pull request Apr 30, 2018

TODOs #2

Closed

edsavage reviewed May 1, 2018


droberts195 reviewed May 1, 2018


bpintea added 5 commits May 2, 2018 12:57

typo fixes in comments or log messages

4b410c5

- mostly s/lenght/length

use the newer signed defs for type mapping

d80a79b

The sign-explicit defs have been added to replace the implicitly signed ones.

remove dead code and update catalog fetching case

52c0c59

Update the logic that gets the catalog name: a new SYS CATALOGS system query is now available, so the way to implement it in the driver is now clear.

b/f from prev commit: SQL_C_BIGINT doesn't exist

76b2491

BIGINT is one of the newer types, which are all explicitly signed/unsigned. (Not sure how it slipped into the previous commit, past a pre-commit compile.)
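As background for this fix: the ODBC headers derive the sign-explicit C type identifiers by adding a signed or unsigned offset to a base type code, and for BIGINT only the explicit variants (SQL_C_SBIGINT, SQL_C_UBIGINT) exist. The sketch below reproduces that arithmetic with local `EX_` constants, whose values mirror sqlext.h as far as I recall them; the mapping function itself is hypothetical and not the driver's.

```c
#include <assert.h>

enum {
	/* base codes, same values as SQL_INTEGER, SQL_SMALLINT, SQL_BIGINT */
	EX_SQL_INTEGER = 4,
	EX_SQL_SMALLINT = 5,
	EX_SQL_BIGINT = -5,
	/* same values as SQL_SIGNED_OFFSET and SQL_UNSIGNED_OFFSET */
	EX_SIGNED_OFFSET = -20,
	EX_UNSIGNED_OFFSET = -22,
	/* value used by this sketch for "no mapping" */
	EX_UNMAPPED = 0
};

/* Return the sign-explicit C type identifier for an integer SQL type,
 * or EX_UNMAPPED for anything this sketch does not cover. */
static int ex_c_type_for(int sql_type, int is_unsigned)
{
	int offset = is_unsigned ? EX_UNSIGNED_OFFSET : EX_SIGNED_OFFSET;

	assert(is_unsigned == 0 || is_unsigned == 1);
	switch (sql_type) {
		case EX_SQL_SMALLINT: /* SQL_C_SSHORT / SQL_C_USHORT */
		case EX_SQL_INTEGER:  /* SQL_C_SLONG / SQL_C_ULONG */
		case EX_SQL_BIGINT:   /* SQL_C_SBIGINT / SQL_C_UBIGINT */
			return sql_type + offset;
		default:
			return EX_UNMAPPED;
	}
}
```

A signed BIGINT therefore maps to -25 and an unsigned one to -27; there is no offset-free identifier to fall back on, which is what made the earlier reference fail to compile.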

break load_es_types() in smaller functions

dead3da

load_es_types() got too large.

bpintea merged commit d4127f1 into elastic:master May 3, 2018

bpintea deleted the feature/column_data_type branch May 3, 2018 11:55

bpintea mentioned this pull request May 4, 2018

SQL: SYS TYPES adjustments for ODBC elastic/elasticsearch#30386

Closed

bpintea added a commit that referenced this pull request Jun 4, 2018

Merge pull request #5 from bpintea/feature/column_data_type

d6a3fc7

Column data type handling

bpintea added the >feature and v6.5.0 labels May 3, 2019
